Conversation

jackfrancis
Contributor

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

Moving the newly added GKE DRA constants to core lets them be reused in customprocessor tests (to help validate the GKE use case), while also accommodating the change in #8583 that enables cloud providers to inject custom scale-down processors.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-area labels Sep 30, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jackfrancis

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 30, 2025
@k8s-ci-robot k8s-ci-robot added area/cluster-autoscaler area/provider/gce size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed do-not-merge/needs-area labels Sep 30, 2025
@jackfrancis
Contributor Author

/assign @elmiko @towca

Contributor

@elmiko elmiko left a comment

i think this would be a fine solution if we don't mind the cloud provider specific constant moving to the core code.

does the gpu_processor_test need that value?

      Name: "nodeGPUViaDra",
      Labels: map[string]string{
-         gce.DraGPULabel: "true",
+         gpu.DraGPULabelGKE: "true",
Collaborator

Ugh, sorry for missing this part during review. IMO this test code shouldn't depend on anything GCE/GKE-specific. And in fact it doesn't: this test case doesn't really test anything at the moment. The Node is treated as ready because it doesn't have the GPU label that TestCloudProvider.GPULabel() returns, not because it has the DRA enablement label.

I think something like this would be much better (and would actually test the DRA part of the logic):

  1. Extend TestCloudProvider to allow configuring the result of TestCloudProvider.GetNodeGpuConfig() from the test code.
  2. Change this test to configure the result of TestCloudProvider.GetNodeGpuConfig() differently in different test cases.
  3. Make this one into a test case where the Node does have the TestCloudProvider.GPULabel() label, and the behavior is different based on the result of TestCloudProvider.GetNodeGpuConfig() (the Node is ready if DraDriverName is set, the Node is not ready if ExtendedResourceName is set).
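The three steps above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual cluster-autoscaler code: `node`, `gpuConfig`, `fakeProvider`, `SetNodeGpuConfig`, `gpuNodeReady`, the `test/gpu` label, and the driver and resource names are all hypothetical stand-ins. Only `GPULabel()`, `GetNodeGpuConfig()`, `DraDriverName`, and `ExtendedResourceName` are names taken from this thread, and the readiness rule is a simplified assumption.

```go
package main

import "fmt"

// node is a simplified stand-in for a Kubernetes Node (hypothetical type).
type node struct {
	name        string
	labels      map[string]string
	allocatable map[string]int64
}

// gpuConfig carries only the two fields named in the review comment.
type gpuConfig struct {
	DraDriverName        string
	ExtendedResourceName string
}

// fakeProvider is a hypothetical test provider whose GetNodeGpuConfig
// result can be configured from test code (step 1).
type fakeProvider struct {
	gpuLabel string
	configs  map[string]*gpuConfig // keyed by node name
}

func newFakeProvider(gpuLabel string) *fakeProvider {
	return &fakeProvider{gpuLabel: gpuLabel, configs: map[string]*gpuConfig{}}
}

func (p *fakeProvider) GPULabel() string { return p.gpuLabel }

// SetNodeGpuConfig is an assumed helper, not an existing API.
func (p *fakeProvider) SetNodeGpuConfig(nodeName string, cfg *gpuConfig) {
	p.configs[nodeName] = cfg
}

func (p *fakeProvider) GetNodeGpuConfig(n *node) *gpuConfig {
	return p.configs[n.name]
}

// gpuNodeReady is an assumed simplification of the readiness check
// discussed in the thread.
func gpuNodeReady(p *fakeProvider, n *node) bool {
	if _, ok := n.labels[p.GPULabel()]; !ok {
		// No GPU label: the node is ready for reasons unrelated to DRA.
		return true
	}
	cfg := p.GetNodeGpuConfig(n)
	if cfg != nil && cfg.DraDriverName != "" {
		// GPU exposed via DRA: no extended resource to wait for.
		return true
	}
	if cfg == nil || cfg.ExtendedResourceName == "" {
		// Assumption: a labeled GPU node without a usable config is not ready.
		return false
	}
	return n.allocatable[cfg.ExtendedResourceName] > 0
}

func main() {
	p := newFakeProvider("test/gpu")
	n := &node{name: "nodeGPUViaDra", labels: map[string]string{"test/gpu": "true"}}

	// Steps 2 and 3: same labeled node, different configured results.
	p.SetNodeGpuConfig(n.name, &gpuConfig{DraDriverName: "dra.example.com"})
	fmt.Println("DRA driver set:", gpuNodeReady(p, n))

	p.SetNodeGpuConfig(n.name, &gpuConfig{ExtendedResourceName: "example.com/gpu"})
	fmt.Println("extended resource set:", gpuNodeReady(p, n))
}
```

Because the node carries the provider's GPU label in both cases, the outcome depends only on the configured `GetNodeGpuConfig()` result, which is the part the current test case never exercises.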

WDYT?

Contributor Author

here's a quick & dirty draft of what step 1 might look like (and also a sanity check that I'm following your logic):

#8604

Contributor

if the DraGPULabelGKE is not required for the test, then i'm in favor of using a more neutral label like something from the test provider. as long as we don't introduce the circular dependency ;)

Collaborator

@jackfrancis Yup, this is exactly what I meant in step 1 - LGTMd the PR!

@jackfrancis
Contributor Author

I'm pretty sure we don't want this PR (though we have an interesting thread convo that we should continue!)

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/cluster-autoscaler area/provider/gce cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. release-note-none Denotes a PR that doesn't merit a release note. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
4 participants